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ABSTRACT 

Background Injuries are often recurrent, with 
subsequent injuries influenced by previous occurrences 
and hence correlation between events needs to be taken 
into account when analysing such data. 
Objective This paper compares five different survival 
models (Cox proportional hazards (CoxPH) model and 
the following generalisations to recurrent event data: 
Andersen-Gill (A-G), frailty, Wei-Lin-Weissfeld total time 
(WLW-TT) marginal, Prentice-Williams-Peterson gap time 
(PWP-GT) conditional models) for the analysis of 
recurrent injury data. 

Methods Empirical evaluation and comparison of 
different models were performed using model selection 
criteria and goodness-of-fit statistics. Simulation studies 
assessed the size and power of each model fit. 
Results The modelling approach is demonstrated 
through direct application to Australian National Rugby 
League recurrent injury data collected over the 2008 
playing season. Of the 35 players analysed, 14 (40%) 
players had more than 1 injury and 47 contact injuries 
were sustained over 29 matches. The CoxPH model 
provided the poorest fit to the recurrent sports injury 
data. The fit was improved with the A-G and frailty 
models, compared to WLW-TT and PWP-GT models. 
Conclusions Despite little difference in model fit 
between the A-G and frailty models, in the interest of 
fewer statistical assumptions it is recommended that, 
where relevant, future studies involving modelling of 
recurrent sports injury data use the frailty model in 
preference to the CoxPH model or its other 
generalisations. The paper provides a rationale for future 
statistical modelling approaches for recurrent sports 
injury. 
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INTRODUCTION 

Sports injuries are often recurrent in that some 
people experience more than one sports injury over 
time. There is wide recognition that subsequent 
injury (of either the same or a different type) can 
be strongly influenced by previous injury occur- 
rences. ^""^ Such recurrent injuries are unlikely to be 
statistically independent, and appropriate statistical 
methods need to be used to analyse such data 
accurately.^"^ While different modelling approaches 
have been used to report recurrent event data, such 
as modelling the w^ithin-person total number of 
events or time to the first event, they have often 
been naive in the statistical sense in that they do 
not take correlation between events into account or 
have excluded important detailed information 
about the subsequent events.^ Over the last decade, 
there have been some significant statistical advances 
in the modelling of recurrent event data.'^ 
While there has been some application to health 



data,^ these methods are yet to be reported in 
sports medicine applications. This means that many 
models of the likelihood of recurrence of sports 
injury, or for understanding causal relationships 
when conditions can be recurrent, could be flaw^ed, 
leading to incorrect information being used to 
inform prevention priorities and programmes. 

A key statistical challenge inherent in analysing 
recurrent injury data is that the probability of 
injury occurrence is likely to be influenced by 
earlier injuries, even when they are not of exactly 
the same type; this can be manifest as an injury 
either raising or low^ering the rate of further injury. 
This is important because analyses that incorrectly 
treat different w^ithin-person injuries as statistically 
independent run the risk of generating misleading 
results. Ignoring potential w^ithin-person event 
dependency leads to reported greater precision 
than is w^arranted and possible biasing of results 
away from the null. A second statistical issue is that 
many naive statistical approaches implicitly restrict 
the baseline probability of injury, and the influence 
of covariates on this, to be the same across all injur- 
ies w^hen, in fact, they vary across people and dif- 
ferent injury types. Across people, this variability 
implies that some w^ill have inherently higher or 
loM^er rates of different subsequent injuries. 
Together, these statistical issues mean that in any 
recurrent injury dataset there w^ill be different 
w^ithin-person correlations across people and that 
the w^ithin-person injury times w^ill be dependent. 
Any correlation among injuries (w^hether produced 
by event dependence or variability) w^ill violate 
assumptions that the timing of injuries is independ- 
ent, and result in problems of estimation and incor- 
rect inference if not properly taken into account. 

Despite many studies documenting the incidence 
of sports injuries, and recognition of the recurrent 
nature of many injuries, appropriate statistical 
modelling for recurrent sports injuries has largely 
been absent from pubHshed studies. In general, sub- 
sequent sports injury has been handled statistically 
in one of three w^ays. The majority of cohort 
studies have reported Poisson counts and calculated 
injury rates as the total number of injuries per unit 
time, even w^hen many players contribute more 
than one injury occurrence to the numerator. 
Inherently, such calculations treat all injuries w^ithin 
given players as independent. When these studies 
have recognised that injury history can predict 
injury risk, they have adjusted for it in regression 
models by including a dichotomous predictor 
representing 'previous injury history? (yes/no)'. On 
the rare occasion when researchers have recognised 
w^ithin-player injury dependency, they have only 
modelled the time to first injury and have excluded 
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valuable information about any subsequent injuries from consid- 
eration/^ To progress recurrent sports injury epidemiology, 
there is a need for guidance in the most appropriate statistical 
models for these data. 

Several event history model variations based on the Cox pro- 
portional hazards (CoxPH) model^^ have been proposed for the 
analysis of repeated events but their application leads to differ- 
ent results because of the different assumptions they make about 
the data they are modelling.^ 18-24 practice, the choice of 
the most appropriate model depends upon the: (a) distribution 
of subsequent event times; (b) w^ithin-person correlation of sub- 
sequent events; (c) frequency of the recurrent events; and (d) 
the specific research question being posed at the time (eg, esti- 
mation of population-level effects of covariates as averaged 
across people or describing event dependency w^ithin 
people).^ A major statistical consideration is therefore how to 
address both the players at risk and the subsequent injuries 
appropriately. 

In the general recurrent event literature, extensions of the 
CoxPH model are popular because they enable all events for 
each individual to be analysed. Application of four prominent 
regression models (Andersen-Gill (A-G),^^ frailty,^^ 
Wei-Lin-Weissfeld total time (WLW-TT) marginal model,^^ 
Prentice-Williams-Peterson gap time (PWP-GT) conditional 
modeP^) yield different results because of their different under- 
lying assumptions. To our know^ledge, these models have not 
been previously applied and compared in sports injury epi- 
demiological studies and so it is currently unknow^n w^hich of 
these is the most suitable for modelling recurrent sports injuries. 

The aim of this paper is to (a) summarise the issues that need 
to be considered w^hen modelHng recurrent sports injury data 
v^here the time before/betw^een injuries is of interest and (b) to 
assess and compare the suitability of the CoxPH model and its 
extensions for modelling such data. The methods and model 
comparison are demonstrated on Australian National Rugby 
League (NRL) injury data to provide defensible guidance on 
how to appropriately model recurrent sports injury data. 



METHODS 
The data 

To demonstrate and compare the applicability of different exten- 
sions of the CoxPH model to a real-w^orld data example, injury 
data w^ere obtained on 35 players from a professional rugby 
league club competing in the 2008 Australian NRL competition. 
Injury and participation data were collected from 29 matches 
(including all trial, fixture and finals matches). Injuries were 
defined as conditions associated w^ith pain or disability that 
occurred during match participation, irrespective of the need 
for first aid, medical attention or time loss.^^ In the context of 
this paper, a recurrent injury was said to have occurred if a 
person sustained more than one injury over the 29 matches, 
irrespective of w^hether or not it was to the same body region or 
of the same type. For this paper, only data on all contact injuries 
(defined as those resulting from tackling, being tackled, collision 
and accidental contact) w^ere extracted. All players received a 
clear explanation of the study, including the risks and benefits of 
participation and w^ritten consent was obtained. The study was 
approved by the University of Queensland Human Ethics 
Committee. 

The models 

The CoxPH model and four recurrent event generalisations 
were applied to the sports injury data. The time variable was 
taken to be the match number (range 1-29). Major statistical 
challenges w^ith this sort of data are how^ to address the number 
of recurrent events and the number of players at risk appropri- 
ately. Four components w^ere considered for the recurrent event 
model :^ (a) risk interval w^hich defines w^hen a player is at risk 
of having an injury along a given timescale and determines 
w^hether a model is either marginal or conditional; (b) risk set 
or the number of players included in the set at a given point in 
time; (c) an event-specific or common baseline hazard; and (d) 
handling of w^i thin-subject correlation. Table 1 summarises how^ 
these components are defined for each of the four models con- 
sidered in this paper. 



Table 1 Statistical specifications and assumptions in relation to the risk interval, risk set, baseline hazard and within-person correlation in the 
extended Cox proportional hazards (CoxPH) models 



Components Andersen-Gill (A-G) 



Frailty 



Wei-Lin-Weissfeld total time (WLW-TT) Prentice-Williams-Petersen gap 
marginal time (PWP-GT) conditional 



Risk interval 

Risk set for 
injury k attime t 



Baseline hazard 



Within-person 
correlation 

Comment 



Duration since starting 
observation 

Independent injuries (any 
given injury occurrence is 
not affected by previous 
injuries) 



Common/same baseline 
hazard across all injuries 



The within-person injuries 
are independent 

A-G model is 
recommended when there 
is no injury dependence 
and no covariate/injury 
effects 



Duration since starting observation Duration since starting observation 



Duration since previous injury 



A random effect (or frailty) term is 
used to account for the within-player 
correlation between injuries to enable 
modelling of the phenomenon by 
which some players are intrinsically 
more or less susceptible to 
experiencing a given injury than 
others are 

Heterogeneity is directly incorporated 
via a random effect so that the 
baseline hazard is allowed to vary 
with each injury 

Captures within-person correlation due 
to both injury dependence and 
heterogeneity 

The frailty approach accounts for 
heterogeneity. The random effect (the 
frailty) has a multiplicative effect on 
the baseline hazard function and the 
mixture of individuals with different 
injury risks 



All players who have not experienced injury All players who have experienced 



k at time t 



Common baseline hazard for all injuries 
within a player 



The within-person injuries are independent 



At any time point (matches), WLW-TT 
describes all players who have not yet 
experienced k injuries are assumed to be at 
risk for the kth injury which is not realistic 
in the sports setting injury data 



injury k-1, and have not experienced 
injury k at time t 



Stratifies the data by injury so that the 
baseline hazard is allowed to vary 
with each injury 

The current injury is unaffected by 
earlier injuries that occurred to the 
player 

PWP-GT model takes into account the 
ordering of events 
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The following three risk intervals were considered: (a) gap 
(or interoccurrence) time representing the time from the prior 
injury event and not relative to the actual timeline of observa- 
tion; (b) calendar time which uses the same timescale for all 
events, referenced to a fixed point in time, but does not allow 
an overlap in risk periods across events for a given player; and 
(c) total time representing the time from the start of the player 
follow-up. In each case, the interval ends with the current 
injury. In both the gap time and calendar time representations, 
the player is at risk for the same length of time. Gap and calen- 
dar time models are conditional since a player is at risk of a new 
injury, conditioned on having sustained a previous injury. For 
total time, the clock does not reset for each event and the begin- 
ning of each event is at the same point in the observation time- 
line; risk periods for different events for the same player 
overlap. Total time models are marginal since the player is at 
risk from the start of play, independent of any previous injury. 
Irrespective of the definition, the risk interval for the first injury 
is the same. 

Figure 1 describes these risk intervals in more detail, through 
the specific examples of three players (figure lA). In the gap 
time representation, after an injury event, the player resumes 
play again at time 0 and the time to the next event corresponds 
to the number of matches that it takes for that player to experi- 
ence the next event. The occurrence of all events after the first 
is modelled on a timescale relative to the prior event and not 
relative to the actual timeline of observation. Thus, the gap time 
(figure IB) for our example indicates that player A is at risk of 
his first injury during 0-2 matches, and of his second, third and 
subsequent injuries during 0-18 matches, 0-6 matches and 0-1 
match, respectively. In the calendar time (figure IC), player A is 
at risk for his first injury event during 0-2 matches, and his 
second, third and subsequent injuries during matches 2-20, 20- 
26 and 26-27, respectively. The total time (figure ID) indicates 
that player A is at risk for his first, second and subsequent injur- 
ies during 0-2, 0-20, 0-26 and 0-27 matches, respectively. 



The Kaplan-Meier (K-M) method is used to estimate the sur- 
vival function non-parametrically from observed (censored and 
uncensored) survival times.^^ The CoxPH model with time to 
first injury event as the outcome variable is a regression model 
used to estimate the survival probability after adjusting for both 
baseline hazard and explanatory variables. This model counts 
the players at risk at the time of this first event, after which they 
are no longer considered to be at risk. The result is an estima- 
tion of the probability of remaining free of injury for a given 
point in time based on the observed injuries. The steps in the 
K-M curves show changes in the probabilities of remaining free 
of injury for various matches across the group of players, when 
first injuries occur in new players. 

The A-G model is a simple extension of the CoxPH model 
where players contribute to the risk set for an event as long as 
they are under observation at the time the injury occurs and 
share the same baseline hazard function. However, the A-G 
model requires the strongest statistical assumptions including 
that of an independent increment in which any given injury 
occurrence is not affected by previous injuries, that is within 
players, injuries are independent. This restriction means that 
injury dependence cannot be included and the A-G model 
inherently assumes that injuries do not change the player and 
that the player does not learn from previous injuries. Moreover, 
this model does not allow investigation of effects that might 
change based on injury-specific covariate effects, but there is the 
possibility of incorporating injury dependence via time- 
dependent covariates. Given these Hmitations, the A-G model is 
recommended when there is no injury dependence and no cov- 
ariate/injury effects. 

Analysis of recurrent injury data frequently assumes that the 
player injury histories are all statistically independent (at least 
conditionally on observed time-fixed covariates) so that the 
interoccurrence times appear in an independent manner. 
However, some players are intrinsically more or less susceptible 
to experiencing an injury than others. The frailty model is 



I Censored ® Injury 



Player C 



Player B 



~\ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 
No Of matches 
1 (a) - Three players with recurrent injuries 




12 14 16 18 20 22 24 26 28 30 
No of matches 
1(b) - Gap time 




10 12 14 16 18 20 
No of matches 
1 (c) - Calendar time 




10 12 14 16 18 20 22 24 26 
No of matches 
1 (d) - Total time 



Figure 1 Illustrations of the risk interval formulations: (A) three players with recurrent injuries; (B) gap time; (C) calendar time; (D) total time. A 
circle (•) indicates an injury event and a solid square (■) indicates censoring. Each time to an event or censoring is a separate risk interval. 
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characterised by its inclusion of a random effect, or frailty, term 
can account for the within-player correlation between injuries. 
If the frailty is less than 1, a player tends to experience the 
injury at a later time than another player, whereas the opposite 
occurs if the frailty is greater than 1 . 

The WLW-TT model is a marginal model and assumes a 
common baseline hazard for all injuries within a player. 
Marginal models consider the marginal distribution of each 
failure time and impose no particular structure of dependence 
among distinct failure times on each player. Each recurrence is 
modelled as a different stratum and each stratum is treated as 
marginal data. This model is marginal with respect to the risk 
set since each player is at risk from the beginning of the study 
and can be at risk for several events simultaneously. 

The PWP-GT model is a conditional model which allows for 
event dependence via stratification by event number so that dif- 
ferent events can have different baseline hazards. The main dif- 
ference to the marginal model is that a player cannot be at risk 
for the later injury until a prior event occurs. This conditional 
model preserves the order of sequential injuries in the creation 
of the risk set and therefore incorporates injury dependence. 
The PWP-GT model is estimated with the data organised in 
interoccurrence/gap time (ie, gap time risk set or time since the 
previous injury). 

Model estimation and evaluation 

The outcome being modelled is the probability of remaining 
injury- free over the 29 matches. As shown in table 1, different 
model formulations handle the time variable in relation to 
injury occurrence differently. All models were fitted using the 
cph function of the Design package within R (Version 
2.12.2).^^ The strata, cluster and frailty functions were used 
to fit the extended CoxPH models. The proportional hazard 
assumption test was performed using the cox.zph command. All 
models were adjusted by age, match experience and body mass 
of the players as known confounders of injury risk in NRL 
players. The R code is available from the authors, upon request. 

K-M curve representations of the observed probability of 
remaining free of injury were used to provide a visual compari- 
son of each model fit. The log likelihood (LL), Akaike informa- 
tion criterion (AIC) and Bayesian information criterion (BIG) 
were used to compare the goodness of fit of the fitted models in 
terms of fitting the observed data.^"^ A lower AIG or BIG indi- 
cates a better fit to the observed data and two models can be 
compared by comparing the differences in the AIG or BIG, with 
preference being given to the model with the smallest criterion 
measure. A simple rule of thumb is that models are not differ- 
ent if the difference in AIG is less than 2; there is minor evi- 
dence of difference when the AIG ranges from 2 to 4, and there 
is strong evidence for a difference with the AIG difference is 
more than 10. When comparing BIG, differences <2 are consid- 
ered weak, those >2 but <6 are positive, those >6 but <10.0 
are strong and BIG differences >10 are very strong.^^ 

Model accuracy 

The most common criterion for evaluating the performance of a 
statistical model is its accuracy in terms of data fit. In this sense, 
the model accuracy is an assessment of the closeness of estimates 
to the exact (or observed) value and can be computed on a 
point-by-point basis. The most widely used measures of accur- 
acy are the mean-squared error (MSE), the root MSE, the mean 
absolute error and the mean absolute percentage error.^^ 
Smaller values of each of measure indicate more accurate and 



rehable models. Eurther details about these measures can be 
found in Hyndman and Koehler.^^ 

Comparing the models 

Three test criteria were used to compare the fitted models: like- 
lihood ratio (LRT), E"^^ and bootstrap tests."^^ All tests 
compare two models where one model is an extension to the 
other (ie, the models are nested, with the simpler model being 
contained as a subset of the more complex one). Eor example, 
the A-G model is nested within the frailty model and compari- 
son of the two models can test if there are random effects com- 
ponents for recurrent events that need to be modelled (as 
considered by the frailty model, but not the A-G model). 

As an example, the LRT begins with a direct comparison of 
the likelihood scores of the two models and tests whether the 
frailty is necessary for analysing recurrent sports injury events. A 
significant LRT suggests that a random effect (frailty) accounts 
for the within-player correlation between injuries. A similar 
approach is used for the F-ratio test. 

In the bootstrapping procedure, a large number of random 
samples are generated."^ ^ The observed test statistic is then com- 
pared with the test statistics calculated from the bootstrap 
samples. Although there are many ways to use the bootstrap for 
hypothesis testing, the method of Walters and Gampbell^^ for 
computing a bootstrap p-value corresponding to the observed 
value of a test statistic Twas used. 

Simulation framework 

Einally, a simulation approach for calculating the size and power 
of the models was undertaken."^^ One hundred sets of injury 
data were simulated from the exponential distribution and the 
bootstrapping procedure was appHed 100 times to each gener- 
ated dataset to obtain the significance level of the test. Within 
the context of model selection, power and size estimates are 
based on the proportion of replications that indicate acceptable 
fit, with greater numbers of replications resulting in smaller GIs 
(higher power, more accuracy) around the estimates. 
Simulations were run on a bi-processed Pentium 4 machine with 
a 3.20 GHz processor and 2.0 Gb RAM memory. The data- 
generating process was performed using the SimSurv function 
of the prodlim package"^"^ from R version 2.12.2,^^ operating on 
a Windows XP professional platform. 

RESULTS 

Overall, 47 contact injuries were sustained by the 35 players 
during a total of 557 player appearances. The median follow-up 
time was 18 matches (range 1-29 matches). More than half of 
the players (54.3%) sustained 1-6 injuries, with 40% sustaining 
>1 injury over the 29 matches (table 2). The most common site 
of injury was the head and neck (26% of all injuries). The inci- 
dence of injury was similar for the shoulder, thigh, calf and 
knee (13%). Sprains (32%), contusions (26%) and haematomas 
(17%) were the most common type of injury. The majority of 
injuries occurred while tackling or being tackled (47%). 

Eigure 2 shows the timing of the incidence of each injury 
event in relation to the total number of matches (each of 80 
min or 1.33 h duration). Gensoring sometimes occurred when a 
player (a) had not experienced the relevant outcome, by the end 
of the season; (b) was lost to follow-up; or (c) experienced a dif- 
ferent event that made further follow-up impossible. The data 
structure shows the complex nature of the recurrent injuries in 
that, in some players, several injuries occurred and the time 
between injuries also differed across players. 
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Table 2 The distribution of number of injuries sustained by 35 
National Rugby League players, the respective number of matches 
with a number of injuries and the injury incidence rates per 1000 
matches 

Total Total Injury 

Number Number number of Proportion number of incidence 
of injuries of players injuries of players matches rates 



0 16 - 45.7 134 

1 5 5 14.3 107 46.7 

2 7 14 20.0 133 105.3 

3 2 6 5.7 55 109.1 

4 4 16 11.4 108 148.1 

5 - - - - - 

6 1 6 2.9 20 300.0 
Total 35 47 557 



The p-values from the pairwise LRT, F and bootstrap model 
comparison tests are shown in table 5. The estimated p-values 
comparing the A-G and frailty models, all being >0.05, show 
that these models are indistinguishable and either model could 
be used for analysing the recurrent sports injuries in this paper. 
There was no significant difference between the A-G and 
PWP-GT models but the A-G model was statistically different 
from the WIW-TT model. The CoxPH was significantly differ- 
ent only to the frailty model. 

Table 6 shows that the bootstrap simulation tests were per- 
formed satisfactorily for each pair of models. However, the 
actual sizes and powers were slightly different from the simu- 
lated model sizes and powers. For example, the simulated model 
size superseded the actual size and simulated model power pre- 
ceded the actual power at 10% level of significance, when the 
WLW-TT and PWP-GT models were considered. 



Figure 3 shows the K-M survival curves as a means of com- 
paring the CoxPH model and its generalisations. The K-M 
curves estimate the probability of remaining free of injury at a 
given point in time based on observed recurrences. All players 
were free of injury at the beginning of the season and survival 
rates were lower after every match, during which injuries 
occurred. Almost 95% of players were injured by the end of 29 
matches. The Cox proportional regression model provided the 
best fit for the first few matches and the worst fit for the remain- 
ing matches. This is not surprising given that the CoxPH model 
only considers the time to the first injury. The fit was also rela- 
tively poor for the WLW-TT and PWP-GT models. The A-G^^ 
and frailty models provided the best fit for modelling recurrent 
sports injury data. 

Table 3 shows the LL, AIC and BIC criteria for the fitted 
models. The AIC and BIC results provide strong evidence that 
both the A-G and frailty models perform better than the 
WLW-TT, PWP-GT and CoxPH models. Although the differ- 
ences in LL were indistinguishable, the A-G and frailty models 
showed minor AIC differences but strong BIC differences in 
fitting the recurrent events. 

Table 4 compares the fit of the five models to the injury data 
according to the four accuracy measures. On all measures, the 
CoxPH model had a poorer fit than each of its extensions. 
Although there was little difference in fit between the two, the 
frailty model performed a little better than the A-G model. 
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Figure 2 Recurrent injury history of 35 professional rugby league 
players. The event of interest is any contact-injury sustained by a 
player, which is denoted by a circle (O)- Censored data which arise 
when the outcome injury status is either not-injured or unknown is 
denoted by solid squares (■). 



DISCUSSION 

Knowing how to choose the best model for analysing recurrent 
events in sports injury settings is important for the generation of 
accurate and reliable information to guide priority setting for 
targeting of intervention investments to tackle the sports injury 
problem. Although there are some guidelines on how to appro- 
priately model injury count data,^^ little prior attention has 
been paid to the analysis of recurrent injury data. A recent con- 
ceptual model has described how and why recurrent injuries are 
a problem in the sports injury context, but gives no guidance on 
how to analyse such data.^^ Analysis of recurrent sports injury is 
complex and researchers interested in this are advised to collab- 
orate with a statistician. 

The need to correctly statistically model recurrent events 
occurs in many clinical trials, longitudinal epidemiological 
studies and sociological research. 47-51 ^pQj-^g injury studies 
often report recurrent events because players can experience 
more than one injury event over a playing season. Sports 
injury prevention is dependent on players' ability to tolerate 
repeated exposures to injury risks while being active in their 
sport. In terms of injury risk, it is likely that some of the risk 
factors for a subsequent injury will also be implicated in the 
initial injury. However, these injuries could also occur because 
an injured player continues to participate in their sport with 
some modification of their techniques, physical adaptation or 
mal-adaptation, complete/incomplete recovery from injury or a 
combination thereof. This means that their risk of further injury 
will no longer be the same as for their first injury. 

In the sports injury literature to date, recurrent injuries have 
been considered from a clinical management and return-to-play 
(or time away from sport to recover from injury) perspective.^" 
3 15 53 54 5pQj-^5 injury surveillance guidelines and several con- 
ceptual papers describe the complex issues associated with prop- 
erly classifying injuries as recurrent, re-injury, exacerbations or 
overuse.^ 55-58 ^^^^ Qf prior work, however, has discussed 
recurrent injuries from a statistical viewpoint, and so adequate 
recognition of the various dependencies both within and 
between injured players is lacking in the sports medicine 
literature. 

In the case of injury count data, sports injury counts have 
been most commonly analysed in the literature as Poisson 
counts. When players would reasonably be expected to sustain 
more than one injury, it would be more correct to apply nega- 
tive binomial models, as we have shown when modelling falls in 
older people."^^ The present study offers a comprehensive 
approach to guide the choice of different survival models when 
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Figure 3 Standard Kaplan-Meier (K-M) curves for probability of remaining free of injury for 35 professional rugby league players. Actual and fitted 
survival curves from (A) CoxPH model, (B) A-G model, (C) frailty model, (D) WLW-TT model and (E) PWP-GT model. The grey shaded regions are 
95% CIs for the fitted survival curves. Models were adjusted by age, match experience and body mass of the players. 



modelling recurrent sports injury data when the time to events 
is of prime interest, rather than only an overall count. 

Although the CoxPH model is the most commonly used 
approach for analysing time-to-event data, it fails to take into 
account the extra variability of the recurrent events and, as this 
paper has show^n, provides only poor fit to recurrent sports 
injury data. This is perhaps not surprising given that it only con- 
siders the time to the first injury and discards the remaining 



injuries. This is a critical limitation because it means that 
important information about injury occurrence and associated 
risk factors is potentially excluded from current models w^hich 
only consider the time to first injury. 

Each of the four tested generalisations of the CoxPH model 
(A-G, frailty, WLW-TT and PWP-GT models) provided a substan- 
tial model improvement over the CoxPH model. In general, the 
A-G and frailty models performed best and provided better data 



Table 3 Model selection criteria (log likelihood (LL), Akaike 
information criterion (AlC) and Bayesian information criterion (BIC)) 
of the fitted models for sports injury recurrent data* 





Model 


selection 


criteria 


Model 


LL 


AlC 


BIC 


Andersen-Gill (A-G) 


135.0 


275.9 


355.6 


Frailty 


134.9 


277.9 


378.0 


Wei-Lin-Weissfeld total time (WLW-TT) marginal 


158.1 


334.2 


487.6 


Prentice-Williams-Petersen gap time (PWP-GT) 


154.8 


327.7 


481.1 



conditional 



*The LL, AlC and BIC were not reported due to the small estimated likelihood for the 
CoxPH model for only the first event. 



Table 4 Mean square error (MSE), root mean-squared error 
(RMSE), mean absolute error (MAE) and mean absolute percentage 
error (MAPE) of the fitted models for sports injury recurrent data 

Model accuracy measures 
Model MSE RMSE MAE MAPE 



Cox proportional hazards (CoxPH) 0.04 0.19 0.15 0.64 

Andersen-Gill (A-G) 0.001 0.04 0.03 0.13 

Frailty 0.001 0.03 0.03 0.12 

Wei-Lin-Weissfeld total time (WLW-TT) marginal 0.03 0.18 0.15 0.64 

Prentice-Williams-Petersen gap time (PWP-GT) 0.01 0.10 0.09 0.47 
conditional 
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Table 5 Pairwise goodness-of-fit (likelihood ratio test (LRT), 
F-ratio (F) and bootstrap (BS)) p-values for comparing the Cox 
proportional hazards (CoxPH) model, Andersen and Gill (A-G) 
model, frailty model, Wei-Lin-Weissfeld total time (WLW-TT) 
marginal model and Prentice-Williams-Petersen gap time (PWP-GT) 
conditional model for sports injury recurrent data 

Goodness-of-fit p values 
Comparison of models LRT* F BSt 

CoxPH vsA-Gt - - - 



CoxPH vs frailty 




<0.001 




CoxPH vs WLW-TT 




0.67 




CoxPH vs PWP-GT 




0.99 




A-G vs frailty 


0.84 


0.50 


0.85 


A-G vs WLW-TT 


0.03 


0.03 


0.02 


A-G vs PWP-GT 


0.20 


0.08 


0.14 


Frailty vs WLW-TT 


0.02 


0.02 


<0.001 


Frailty vs PWP-GT 


0.03 


0.02 


0.01 


WLW-TT vs PWP-GT+ 






0.78 



*LRT test is based on log likelihood and is not appropriate for comparing first event 
model (CoxPH model) and recurrent events models (Cox extension models). 
tThe resampling procedure was based on the CoxPH model in the BS test and hence 
the extended models were not fitted for first event only when compared with the 
CoxPH model. 
^Models are not nested. 

fits to the recurrent sports injury data when compared to both 
WLW-TT and PWP-GT models. 

There was no statistical difference between the A-G and frailty 
models when applied to the NRL recurrent injury data analysed 
in this study in terms of model selection, goodness of fit or accur- 
acy. This was confirmed with the simulation substudy. 
Nonetheless, as the frailty model requires fewer data assumptions 
than the A-G model and it does allow investigation of effects that 
might change based on injury-specific covariate effects (which 
the A-G model does not), it is recommended that the frailty 
model be adopted when analysing recurrent sports injury data in 
the future, when this is consistent with the research question. 

The A-G model is the most simple variance-corrected model 
and incorporates robust variance estimators, which have good 
statistical properties under some circumstances. This model can 



also be used to adjust for covariate effects. The frailty model 
includes a random effect (frailty) to account for the within- 
subject correlation between injuries and so is a more general 
model, with fewer assumptions. In the case where there is sig- 
nificant within-person correlation (as applies to our injury data), 
Kelly and Lim^ recommend the use of frailty models, which 
incorporate random effects because they fit the data better than 
the PWP-GT model. 

The statistical model comparison was only conducted on a 
small injury sample, and it is possible that different conclusions 
may arise when applied to other injury contexts. We have recently 
applied the frailty model to other rugby league injury data, 
including for the purposes of risk factor identification, indicating 
its likely robustness for this sort of recurrent injury data.^^ 

Although the frailty model has offered the best fit to the rugby 
league recurrent injury count data in this study, this does not guar- 
antee that this model would offer the best fit for other sports 
injury data sets, and this would need further exploration. 
Nevertheless, the fitting procedures presented in this paper, and 
the various model selection criteria, may be used as guidelines for 
modelling recurrent injury data in other sports injury contexts. 

In conclusion, sports injury data characterised by recurrent 
events due to repeat or subsequent injuries over a period of 
time need to be appropriately analysed to take into account the 
different likely dependences within the data. Such data can be 
appropriately analysed by either the A-G or frailty model, with 
the frailty model representing a marginally better fit than A-G 
model. The strength of the frailty model is that it considers indi- 
vidual baseline injury risks for different players, makes fewer 
statistical assumptions and also is able to model time-varying 
covariate s. 



What this study adds 



► A summary of the important statistical considerations when 
analysing recurrent injury data. 

► Guidance on the best statistical model to use for analysing 
recurrent sports injuries. 



Table 6 Simulated estimates (based on 100 simulation 
replications) of the size and power of the test to compare Andersen 
and Gill (A-G) model, frailty model, Wei-Lin-Weissfeld total time 
(WLW-TT) marginal model and Prentice-Williams-Petersen gap time 
(PWP-GT) conditional model fitted to sports injury recurrent data* 





Simulated model size 


Simulated model power 


Comparison 
ofmodels 


Pr(P>a): 






Pr(P>p)=1- 






a=0.01 


a=0.05 


a=0.10 


1-P=0.99 


1-P=0.95 


1-P=0.90 


A-G vs frailty 


0.02 


0.04 


0.08 


0.99 


0.99 


0.92 


A-G vs WLW-TT 


0.03 


0.08 


0.13 


0.97 


0.93 


0.88 


A-G vs PWP-GT 


0.03 


0.05 


0.11 


0.99 


0.96 


0.93 


Frailty vs 
WLW-TT 


0.02 


0.05 


0.10 


0.96 


0.90 


0.86 


Frailty vs 
PWP-GT 


0.01 


0.04 


0.09 


0.98 


0.93 


0.89 


WLW-TT vs 


0.01 


0.04 


0.18 


0.98 


0.90 


0.80 



PWP-GT 



*The re-sampling procedure was based on the Cox model in the bootstrap test and 
hence the extended models were not fitted for first event only when compared with 
the Cox regression model. 
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